Home |
| Latest | About | Random
# 16 Solution sets of a linear system $A\vec x = \vec b$. By now, we should be pretty comfortable in the technique of solving linear systems $A\vec x = \vec b$ -- it amounts to row reducing the augmented matrix $[\ A \ \vdots \ \vec b\ ]$ to an echelon form, and look at the locations of the pivots, then, if consistent, identify the free and pivot variables and solve. This is great, but what we like to do in math is to step back and think about what do answers to $A\vec x = \vec b$ look like, perhaps we can describe the **structure** of the solution sets of $A\vec x = \vec b$. And this is what we will do. --- Let us briefly review what we know. Recall that any linear system of $n$ equations over $k$ unknowns can be expressed as a matrix-vector equation $$A\vec x=\vec b ,$$where $A$ is an $n\times k$ matrix, $\vec b$ an $n\times 1$ column vector, and $\vec x$ an $k\times 1$ unknown vector to be solved. Since this is a linear system, we already know the three possible outcomes, which we restate it here: > For $A\vec x=\vec b$, where $A$ is $n\times k$ and $\vec b$ is $n\times 1$, and $\vec x$ some unknown vector to be solved, this system is either > (1) Consistent with unique solution. > (2) Consistent with multiple solution (infinitely many if over $\mathbb{R}$ or $\mathbb{C}$) > (3) Inconsistent with no solution. And we know precisely when we have each scenario, by examining the augmented matrix $[\ A\ \vdots \ \vec b\ ]$ and an echelon form EF that it is row equivalent to: > If $[\ A\ \vdots \ \vec b\ ]$ is row equivalent to an echelon form $E$, > (1) If $E$ does not have a pivot in the last augmented column, and $E$ has no free variable columns, then $A\vec x = \vec b$ has unique solution. > (2) If $E$ does not have a pivot in the last augmented column, and $E$ has a free variable column, then $A\vec x=\vec b$ has multiple solutions (infinitely many if $\mathbb{R}$ or $\mathbb{C}$). > (3) If $E$ has a pivot in the last augmented column, then $A\vec x= \vec b$ has no solution. Note this reduces to checking where the pivots are in this echelon form $E \stackrel{\text{row}}\sim[\ A\ \vdots \ \vec b\ ]$. --- Ok, great. But we now would like to say more. What do the solution sets look like when the system $A\vec x = \vec b$ is consistent? Let us consider some terminologies first. When $\vec b=\vec 0$ is a zero vector (all entries zeros), the linear system $A\vec x = \vec 0$ is said to be **homogeneous**. If $\vec b\neq\vec 0$ (not all zeros), then $A \vec x = \vec b$ is said to be **nonhomogeneous**. A quick observation we can make: A homogeneous system $A\vec x=\vec 0$ is **always consistent**, since it always admit a zero vector $\vec x = \vec 0$ as a solution (of an appropriate size). This solution $\vec x = \vec 0$ is called the **trivial solution**. Any nonzero solution to $A\vec x= \vec 0$ is called a **non-trivial solution** and of course we would like to know when does the homogeneous equation $A \vec x = \vec 0$ has non-trivial solutions. And with some brief thought, we know the answer: > $A\vec x = \vec 0$ has a non-trivial solution if and only if it has a free variable. And correspondingly (rephrasing above): > $A\vec x = \vec 0$ has only the trivial solution (and hence unique solution) if and only if it has no free variables. Let us think about why. The augmented matrix to $A\vec x = \vec 0$ is $[\ A\ \vdots \ \vec 0 \ ]$, which no matter how you row reduce, the augmented column will remain zero. So no pivot can ever occurs in the augmented column, so it is always consistent. Now, if there is a free variable, then we can set the free variable to some nonzero value, giving a non-trivial solution. Conversely, if it does not have a free variable, imagine the pivot structure in its echelon form: We will be able to solve each variable one at a time and they will all equal to zero, giving just the trivial solution! --- Now let us look at the solution set to the homogenous system $A\vec x = \vec 0$. If there are no free variables, then the solution set is simply $\{\vec 0\}$, a singleton set with just the trivial solution. Note, we can re-write this (perhaps trivially so right now) as $\operatorname{span}(\vec 0)$. For reasons to be more clear later, we will make a definition: We define the span of the empty set to also be $\{\vec 0\}$, that is $\operatorname{span}(\varnothing)=\operatorname{span}\{\ \}:=\{\vec 0\}$. Keep this in mind. If there are $\ell$ free variables, say $t_{1}, t_{2},\ldots,t_{\ell}$, then we can express the solutions as some linear combination $t_{1} \vec v_{1} + t_{2}\vec t_{2}+\cdots+t_{\ell}\vec v_{\ell}$, where $t_{1},\ldots,t_{\ell}$ free, and $\vec v_{1},\ldots,\vec v_{k}$ non-zero vectors. This means the set of solutions is just all such linear combination, namely $$ \operatorname{span} (\vec v_{1},\vec v_{2},\ldots,\vec v_{\ell}). $$ Let us see an example that illustrates this. **Example.** Find all solutions for $\vec x$ to the matrix-vector equation $$ \begin{bmatrix} 1 & 3 & 2 & 3 \\ 1 & 2 & 1 & 1 \\ 2 & 5 & 3 & 4 \end{bmatrix} \vec x = \begin{bmatrix} 0\\0\\0 \end{bmatrix}, $$ and express the set of solutions as a span of some vectors. $\blacktriangleright$ We set up the augmented matrix and row reduce to an echelon form: $$ \begin{bmatrix} 1 & 3 & 2 & 3 & \vdots & 0 \\ 1 & 2 & 1 & 1 & \vdots & 0 \\ 2 & 5 & 3 & 4 & \vdots & 0 \end{bmatrix}\stackrel{\text{row}}\sim \begin{bmatrix} 1 & 3 & 2 & 3 & \vdots & 0 \\ 0 & -1 & -1 & -2 & \vdots & 0 \\ 0 & -1 & -1 & -2 & \vdots & 0 \end{bmatrix}\sim \begin{bmatrix} 1 & 3 & 2 & 3 & \vdots & 0 \\ 0 & -1 & -1 & -2 & \vdots & 0 \\ 0 & 0 & 0 & 0 & \vdots & 0 \end{bmatrix}. $$And if we organize the variables as $\vec x = \begin{bmatrix}x_{1}\\x_{2}\\x_{3}\\x_{4}\end{bmatrix}$, then we have $x_{3}, x_{4}$ as free variables. Solving the pivoted variables in terms of the free variables, we have $$ \begin{align*} x_{2}&=-x_{3}-2x_{4} \\ x_{1}&=-3x_{2}-2x_{3}-3x_{4} \\ &=-3(-x_{3}-2x_{4})-2x_{3}-3x_{4} \\ &=x_{3}+3x_{4} \end{align*} $$So the solution vector is given by $$ \vec x = \begin{bmatrix} x_{3}+3x_{4} \\ -x_{3}-2x_{4} \\ x_{3} \\ x_{4} \end{bmatrix}=x_{3}\begin{bmatrix} 1\\-1\\1\\0 \end{bmatrix}+x_{4}\begin{bmatrix} 3\\-2\\0\\1 \end{bmatrix}\quad\text{where \(x_3,x_4\) free.} $$In other words, the solution set to this linear system is $$ \operatorname{span} \{\begin{bmatrix} 1\\-1\\1\\0 \end{bmatrix}, \begin{bmatrix} 3\\-2\\0\\1 \end{bmatrix} \}. $$Note how this system has $2$ free variables, and the solution set is a span of two non-zero vectors. $\quad\blacklozenge$ We can summarize this result as follows: > **Theorem. Structural theorem of solutions to homogeneous linear systems.** > The homogeneous linear system $A\vec x=\vec 0$ has a solution set of the form $\operatorname{span}(\vec v_{1},\ldots,\vec v_{\ell})$, spanned by $\ell$ many nonzero vectors where $\ell$ is the number of free variables of the system. This statement can be made even more precise later. We will however make one new definition here: Let us call the set of all solutions to $A\vec x =\vec 0$ the **nullspace of $A$**, and we can write it as $\text{nullspace}(A)$ or $NS(A)$. Mathematically, $$ \text{nullspace}(A)=\{\vec x : A\vec x=\vec 0\}. $$ So our result says, the nullspace of a matrix $A$ is the span of some set of vectors. --- Now let us turn our attention to the nonhomogeneous system $A\vec x = \vec b$. Of course, such a nonhomogeneous system is not always consistent. Recall that, > The linear system $A\vec x = \vec b$ is consistent if and only if $\vec b$ is a linear combination of the columns of $A$ (namely $\vec b$ is in the span of the columns of $A$). Since we quite often refer to the span of the columns of $A$, we also give this a name, called the **columnspace of $A$**, where we denote it as $\text{columnspace}(A)$ or $CS(A)$. If $\vec a_{1},\vec a_{2},\ldots,\vec a_{k}$ are the columns of $A$, then $$ \text{columnspace}(A)=\operatorname{span} (\vec a_{1},\vec a_{2},\ldots,\vec a_{k}). $$ Now suppose that $A\vec x = \vec b$ is consistent (so $\vec b\in\text{columnspace}(A)$), what can we say? As it turns out we have this pervasive theorem: > **Theorem. Structural theorem of solutions to nonhomogeneous linear systems.** > If $A\vec x=\vec b$ is consistent, then there exists a **particular solution** $\vec x_{p}$ to this nonhomogeneous system. And each solution $\vec x$ to this nonhomogeneous system is **precisely** a vector of the form $$ \vec x= \vec x_{h} + \vec x_{p} $$where $\vec x_{h}$ is **some homogeneous solution** to the equation $A\vec x =\vec 0$. And all vectors of this form is a solution to this nonhomogeneous system. (Students who have taken a course on elementary ordinary differential equation, does this look familiar?) What this means is, if $A\vec x = \vec b$ is consistent, then take the set of all homogeneous solutions, namely $NS(A)$, and then translate them by any fixed particular solution $\vec x_{p}$ to $A\vec x = \vec b$. That is, $NS(A)+\vec x_{p}$ forms the solution set to $A\vec x = \vec b$. **A notation**. If $S$ is a set of things where each element in $S$ can be added to some element $t$, then we write $S+t$ is the set $$ S+t=\{s+t:s\in S\}. $$So if $S=\{1,2,3\}$, then $S+5=\{6,7,8\}$. So $NS(A)+\vec x_{p}$ is the set formed by taking each element in $NS(A)$ and add to it $\vec x_{p}$. **Example.** Let us find the set of solutions to $A\vec x = \vec b$ where $A = \begin{bmatrix}1 & 2 & 5\end{bmatrix}$ and $\vec b = [3]$. Let us approach it two ways. First by what we have been doing, and second by appealing to the theorem: Approach 1. Let us write $\vec x = \begin{bmatrix}x_{1}\\x_{2}\\x_{3}\end{bmatrix}$. Note the augmented matrix to this system is just $\begin{bmatrix}1 & 2 & 5 & \vdots & 3\end{bmatrix}$, which is already in echelon form. We have $x_{2},x_{3}$ free, and that $x_{1} = 3-2x_{2}-5x_{3}$. This gives $$ \vec x = \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}=\begin{bmatrix} 3-2x_{2}-5x_{3} \\ x_{2} \\ x_{3} \end{bmatrix}=x_{2}\begin{bmatrix} -2 \\ 1 \\0 \end{bmatrix}+x_{3}\begin{bmatrix} -5 \\ 0 \\1 \end{bmatrix} + \begin{bmatrix} 3 \\ 0 \\0 \end{bmatrix}\quad\text{where \(x_2,x_3\) are free.} $$ Approach 2. With $\vec x = \begin{bmatrix}x_{1}\\x_{2}\\x_{3}\end{bmatrix}$, the system $A\vec x = \vec b$ amounts to the equation $x_{1}+2x_{2}+5x_{3}=3$. By inspecting, we see that $\vec x_{p} = \begin{bmatrix}3 \\0\\0\end{bmatrix}$ solves this system $A\vec x = \vec b$. (**Note there are many choices**!) Now the homogeneous version of this problem $A\vec x = \vec 0$ has an augmented matrix $\begin{bmatrix}1 & 2 & 5 & \vdots & 0\end{bmatrix}$, which is already in echelon form. We have $x_{2},x_{3}$ free, with $x_{1} = 3-2x_{2}-5x_{3}$. So a homogeneous solution has the form $$ \vec x_{h} =x_{2}\begin{bmatrix} -2 \\ 1 \\0 \end{bmatrix}+x_{3}\begin{bmatrix} -5 \\ 0 \\1 \end{bmatrix}\quad\text{where \(x_2,x_3\) are free.} $$And using the theorem above, the general solution to $A\vec x = \vec b$ has the form $\vec x_{h} + \vec x_{p}$, namely $$ x_{2}\begin{bmatrix} -2 \\ 1 \\0 \end{bmatrix}+x_{3}\begin{bmatrix} -5 \\ 0 \\1 \end{bmatrix} + \begin{bmatrix} 3 \\ 0 \\0 \end{bmatrix} \quad\text{where \(x_2,x_3\) are free.} $$And to describe this set of solutions to $A\vec x = \vec b$, we can write $$ \operatorname{span} \{ \begin{bmatrix} -2 \\ 1 \\0 \end{bmatrix},\begin{bmatrix} -5 \\ 0 \\1 \end{bmatrix} \} +\begin{bmatrix} 3 \\ 0 \\0 \end{bmatrix}. $$Note well that this cannot be written as a single span of a set of vectors -- it is a span of a set of vectors and translated by $\vec x_{p} = \begin{bmatrix}3 \\ 0 \\0\end{bmatrix}$! $\blacklozenge$ **Remark.** Here the particular solution $\vec x_{p}$ is just any one solution to $A\vec x = \vec b$. It doesn't have to be "normal" or "perpendicular" to whatever plane or what not -- it just needs to solve $A\vec x = \vec b$. And yes **there are many choices** of $\vec x_{p}$ you can make here, but **no matter which one you choose** to translate the homogeneous solution set, you will end up with the same set of solutions to $A\vec x= \vec b$! But why? This deserves a proof, presented here in [[smc-spring-2024-math-13/linear-algebra-notes/16a-proof-of-the-theorem-of-solutions-to-nonhomogeneous-systems|notes 16a]]! Read it and we will discuss it next class time. In this above example, we can geometrically describe what is happening. The solutions to the homogeneous problem $A\vec x = \vec 0$ forms a plane through the origin, while the solutions to the nonhomogeneous problem $A\vec x = \vec b$ is this same plane but translated by $\vec x_{p}$. We have as diagrammatically below: ```tikz \begin{document} \begin{tikzpicture}[x={(-0.8cm,-0.4cm)}, y={(1cm,0cm)}, z={(0cm,1cm)}] % Axes \draw[gray,->] (-2,0,0) -- (2,0,0) node[anchor=north east]{\(x\)}; \draw[gray,->] (0,-2,0) -- (0,2,0) node[anchor=north west]{\(y\)}; \draw[gray,->] (0,0,-2) -- (0,0,2) node[anchor=south]{\(z\)}; % two vectors \filldraw[blue!30, opacity=0.4] (2,-0.5,1.5) -- (2,-0.5,3) -- (-2,0.5,-1.5) -- (-2,0.5,-3) -- cycle; \node[blue] at (-1,0,-2.2) {solutions to \(A\vec x = \vec 0\)}; \filldraw[red!30, opacity=0.4] (3,1,2.5) -- (3,1,4) -- (-1,2,-0.5) -- (-1,2,-2) -- cycle; \draw[black,thick,->] (0,0,0) -- (1,1.5,1); \node[black, anchor = north west] at (0.5,0.75,0.75) {\(\vec x_p\)}; \node[red] at (1,4,-0.5) {solutions to \(A\vec x = \vec b\)}; \node[blue] at (1,-1,1) {\(NS(A)\)}; \node[red] at (0,-0.5,2.5) {\(NS(A) + \vec x_p\)}; \end{tikzpicture} \end{document} ``` Bottom-line: If the nonhomogeneous system $A\vec x=\vec b$ is consistent, then it has a solution set that is a **parallel translation** of the homogeneous solutions to $A\vec x = \vec 0$ (namely, the nullspace of $A$), and the amount of translation is **any** particular solution $\vec x_{p}$ to $A\vec x = \vec b$! This is an important idea that shows up in many "linear algebra"-esque context.